1.2 Data Cleaning

In the dataset,

Here, we can see that the number of null values in the dataset is small: revolving utilization and public record bankruptcies each have null values in less than 0.15% of the entire dataset.

Thus, we can remove the rows containing these null values from the dataset.
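As a minimal sketch of this step, assuming the data sits in a pandas DataFrame and that the two fields are stored under the hypothetical column names `revol_util` and `pub_rec_bankruptcies` (neither the names nor the toy values below come from the notebook), the null share per column can be checked and the affected rows dropped like this:

```python
import pandas as pd

# Toy stand-in for the loan data; column names and values are assumptions.
df = pd.DataFrame({
    "loan_amnt": [10000, 8000, 15000, 12000, 5000],
    "revol_util": [41.8, None, 92.2, 21.5, 69.8],
    "pub_rec_bankruptcies": [0.0, 0.0, None, 1.0, 0.0],
})

# Percentage of null values in each column.
null_pct = df.isnull().mean() * 100

# Drop every row that has a null in either of the two columns.
cleaned = df.dropna(subset=["revol_util", "pub_rec_bankruptcies"])

print(null_pct.round(2))
print(len(df), "->", len(cleaned), "rows")
```

Dropping rows rather than imputing is reasonable here only because the affected share is so small; with a larger null share the choice would need revisiting.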

Now, let's check the description of

Now we're done pre-processing our data!

2 Data Analysis

Let's look at the loan status of the loans in the dataset, split by their loan terms!

Results show that the proportion of charged-off loans is twice as high for the longer-term loans. Hence, our team has chosen to focus our analysis on the 60-month loans.
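One way to produce this kind of split is a row-normalised cross-tabulation. The sketch below assumes pandas and the hypothetical column names `term` and `loan_status`; the rows are made-up toy data and do not reproduce the actual results.

```python
import pandas as pd

# Toy data; column names, labels and values are assumptions for illustration.
df = pd.DataFrame({
    "term": ["36 months"] * 4 + ["60 months"] * 4,
    "loan_status": [
        "Fully Paid", "Fully Paid", "Fully Paid", "Charged Off",
        "Fully Paid", "Fully Paid", "Charged Off", "Charged Off",
    ],
})

# Each row of the table holds the status proportions within one loan term.
status_by_term = pd.crosstab(df["term"], df["loan_status"], normalize="index")

# Keep only the 60-month loans for the rest of the analysis.
loans_60m = df[df["term"] == "60 months"]

print(status_by_term)
```

Normalising by row (`normalize="index"`) is what makes the two terms comparable, since the terms may contain very different numbers of loans.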

Numeric Variables

2.1 Debt to Income Ratio

DTI Recommendation

2.2 Open Accounts

2.3 Annual Income

2.4 Purpose

Group up purposes:
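The grouping itself is not shown in the text, so the following is only a sketch of how such a grouping could be done with pandas. The column name `purpose`, the purpose labels, and the group names in `purpose_groups` are all hypothetical.

```python
import pandas as pd

# Toy data; the purpose labels are assumptions, not the notebook's values.
df = pd.DataFrame({
    "purpose": [
        "credit_card", "debt_consolidation", "car",
        "wedding", "small_business", "home_improvement",
    ]
})

# Hypothetical mapping from detailed purposes to broader groups.
purpose_groups = {
    "credit_card": "debt",
    "debt_consolidation": "debt",
    "car": "major_purchase",
    "home_improvement": "home",
}

# Map each purpose to its group; purposes not in the mapping become "other".
df["purpose_group"] = df["purpose"].map(purpose_groups).fillna("other")

print(df["purpose_group"].value_counts())
```

Collapsing rare purposes into a catch-all group keeps each group large enough for its charged-off frequency to be meaningful.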

2.5 Home Ownership

Rent Recommendation

2.6 Subgrades

Let's explore Grade E onwards
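A possible way to restrict the data to grades E, F and G and compare their subgrades is sketched below. It assumes pandas, the hypothetical column names `sub_grade` and `loan_status`, and subgrades written as a grade letter followed by a digit (for example `E1`); the rows are toy data.

```python
import pandas as pd

# Toy data; column names and values are assumptions for illustration.
df = pd.DataFrame({
    "sub_grade": ["B3", "E1", "E1", "F2", "G5", "D4"],
    "loan_status": [
        "Fully Paid", "Charged Off", "Fully Paid",
        "Charged Off", "Charged Off", "Fully Paid",
    ],
})

# The grade is the leading letter of the subgrade; keep grades E, F and G.
grade = df["sub_grade"].str[0]
high_grades = df[grade.isin(["E", "F", "G"])]

# Share of charged-off loans within each remaining subgrade.
charged_off_rate = (
    high_grades["loan_status"].eq("Charged Off")
    .groupby(high_grades["sub_grade"])
    .mean()
)

print(charged_off_rate)
```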

Subgrade Recommendation

Appendix: Extra graphs and workings

DTI

DTI x Frequency of Charged Off for all loans + 36M

Open Accounts x Frequency of Charged Off for all loans + 36M

Annual Income

Annual Income x Frequency of Charged Off for all loans + 36M

Purpose

Annual Income x Frequency of Charged Off based on all purposes (all loans)

Annual Income x Frequency of Charged Off based on all purposes (36M)

Home ownership (all loans)

Bankruptcy (not used in report)